ABSTRACT

We  develop  a  mixed  integer  linear  program  (MILP)  to  maximize  the  information  gain 
from  a  team  of  autonomous  unmanned  vehicles  (UxVs).  Our  modeling  and  algorithmic 
development  enables  UxVs  operating  in  a  decentralized  framework  to  develop  flight 
plans  that  simultaneously  adapt  to  the  perceived  environment  and  support  Intelligence, 
Surveillance  and  Reconnaissance  (ISR)  mission  objectives.  The  mathematical 
formulation  considers  each  UxV’s  perspective  of  the  environment  and  mission,  as 
information  is  only  exchanged  when  UxVs  are  part  of  the  same  communication  network. 
The  main  strategy  is  to  discretize  space  and  time  to  represent  the  potential  information 
gain. The mathematical program is used to evaluate the “Price of Anarchy”: the loss of
system effectiveness due to the lack of overall coordination of its resources.
Network  connectivity  is  represented  in  the  MILP  by  a  set  of  binary  variables.  When 
communication  links  are  added  or  removed  from  the  problem,  the  structure  of  the 
connectivity  matrix  permits  the  identification  of  sub-networks  (i.e.,  connected 
components)  within  the  set  of  UxVs,  allowing  for  an  evaluation  of  system  performance 
with different degrees of decentralization. Our approach is innovative in proposing a
“Perspective Optimization” method and in measuring the “Price of Anarchy” when a
team of UxVs performs multiple mission-centric ISR tasks.

1.0  Introduction

Military  operators  at  all  levels  of  Command  and  Control  (C2)  systems  need  to  focus  their  attention  on  a 
wide  variety  of  important  issues  to  make  critical  decisions  within  operational  timelines.  Autonomous 
unmanned  vehicles  (UxVs)  allow  military  personnel  to  spend  a  greater  percentage  of  their  time  analyzing 
threat situations, as opposed to determining how to acquire information, resulting in timely
decision-making that could mean the difference between mission success and failure.

We  develop  a  mixed  integer  linear  program  (MILP)  to  maximize  the  information  gain  from  a  team  of 
autonomous  unmanned  vehicles  (UxVs).  Our  modeling  development  enables  the  unmanned  vehicle 
system,  consisting  of  the  UxV  equipped  with  on-board  sensors  and  its  off-board  Control  Station  (CS),  to 
develop  flight  plans  that  simultaneously  adapt  to  the  perceived  environment  and  support  Intelligence, 
Surveillance  and  Reconnaissance  (ISR)  mission  objectives  in  a  decentralized  framework.  The 
mathematical  formulation  considers  each  UxV’s  perspective  of  the  environment  and  mission,  as 
information  is  only  exchanged  when  UxVs  are  part  of  the  same  communication  network.  Each  UxV 
perspective  of  the  environment  forms  the  basis  of  our  “Perspective  Optimization”  approach.  The  main 
strategy  is  to  discretize  space  and  time  to  represent  the  potential  information  gain.  The  mathematical 
program  is  used  to  evaluate  the  “Price  of  Anarchy”:  the  loss  of  effectiveness  on  the  system  due  to  the  lack 
of overall coordination of its resources. Network connectivity is represented in the MILP by a set of binary
variables, $c_{ijt}$. When communication links are removed (i.e., $c_{ijt} = 0$) from the original problem (i.e.,
centralized framework) in which all communication links are present (i.e., $c_{ijt} = 1$), the structure of the
connectivity matrix allows the identification of sub-networks (i.e., connected components) within the set
of UxVs, allowing for an evaluation of system performance with different degrees of decentralization.
Our approach is innovative in proposing a “Perspective Optimization” method and in measuring the
“Price of Anarchy” when a team of UxVs performs multiple mission-centric ISR tasks. The
developed mathematical programming model and applied concepts can be easily extended to analyze
other unmanned or manned systems.

In  our  model,  each  CS  requires  line-of-sight  to  communicate  with  its  UxV.  The  range  for  this 
communication  is  limited.  UxVs  are  responsible  for  sensing  the  environment.  Each  CS  is  responsible  for 
determining  the  flight  plans  of  its  UxV  over  the  planning  horizon,  considering  mission  objectives,  its 
perspective  of  the  environment,  and  the  potential  collected  information  from  its  UxV.  Each  CS  can  also 
exchange  information  with  other  “neighbor”  CSs.  This  communication,  however,  is  also  limited.  The 
primary  objective  of  routing  the  UxVs  is  then  to  maximize  the  overall  expected  information  gain 
considering available sensing capabilities; simultaneously, CSs need to maximize the network
connectivity  with  other  CSs  to  enable  information  sharing  over  the  planning  horizon  in  the  decentralized 
framework.  Our  proposed  approach  accomplishes  these  objectives,  enabling  unmanned  vehicle  systems  to 
autonomously  develop  and  follow  flight  plans  that,  simultaneously,  adapt  to  the  perceived  environment 
and  support  mission  objectives.  Moreover,  our  approach  allows  us  to  measure  the  “price  of  anarchy” 
when each CS performs its own flight plan optimization.

The rest of this paper is organized as follows: Section 2 describes, in general, different frameworks for
planning and control of autonomous systems. Section 2.1 describes recent research on planning and
control, including work assuming a decentralized framework. Section 3 describes our proposed mathematical model. In
Section  4  we  introduce  the  concept  of  the  price  of  anarchy.  In  Section  5  the  applicability  of  the  model  to 
measure  the  price  of  anarchy  is  presented.  Finally,  in  Section  6,  conclusions  and  future  research  are 
discussed. 

2.0  Centralized,  Decentralized  and  Hybrid  Frameworks

Frameworks for planning and control of autonomous systems, similar to those for information
fusion systems, have in many cases been classified into three types of topologies [1, 2]: (1) centralized, (2)
decentralized, and (3) hierarchical (hybrid).


Figure  1.  Example  of  a  Centralized  Framework 


Figure  2.  Example  of  a  Decentralized  Framework 


Figure  3.  Example  of  a  Hybrid  Framework

In a centralized architecture, the information is propagated from node to node in the network until it
reaches a “central” node responsible for determining and disseminating the expected decisions (e.g., path
definition, task assignments) to all lower nodes (Figure 1). This places a high computational burden on
the central node and requires a robust and reliable communication network that allows virtually perfect
information flow among all the agents in the system and the central node.

Decentralized  frameworks  (Figure  2)  rely  on  local  (i.e.,  by  each  agent)  processing  of  information, 
including  information  from  nearby  agents,  and  local  decision-making.  This  framework  reduces  the 
computational  and  communication  requirements  of  a  centralized  framework,  allowing  scalability  of  the 
system  to  large  group  sizes  [3].  Jameson,  in  [4],  compared  a  few  distributed  architectures  based  on  a  set 
of  general  requirements  for  distributed  information  fusion.  In  his  work,  nodes  on  the  network  consisted  of 
fusion  centers  (e.g.,  command  and  control  centers)  and/or  sensors.  For  the  purpose  of  this  discussion,  it  is 
at  these  fusion  nodes  where  planning  and  control  decisions  are  made.  The  first  architecture  analyzed  was 
the  single  composite  picture  implemented  by  the  US  Navy  in  the  Cooperative  Engagement  Capability 
(CEC)  system.  This  decentralized  framework  consists  of  high  speed  communication  links  connecting  peer 
nodes.  Each  node  consists  of  high  quality  sensors.  All  nodes  in  this  architecture  fuse  data  using  the  same 
algorithm  so,  given  the  low  latency  provided  by  the  network,  all  nodes  maintain  virtually  the  same 
“fused”  picture.  A  “fused”  picture  refers  to  the  representation  of  entities  (e.g.,  targets,  assets)  in  the 
environment  by  combining  data  from  multiple  sensors.  As  would  be  expected,  the  requirements, 
particularly  the  communications  bandwidth,  to  maintain  such  an  accurate  and  high-speed  network  are 
substantial.  The  grapevine  architecture  is  also  a  decentralized,  peer  to  peer  architecture  in  which  each 
node  is  capable  of  fusing  the  data  collected  by  local  sensors,  as  well  as  the  data  received  from  peer  nodes. 
At  each  node,  a  Grapevine  manager  is  responsible  for  the  interchange  of  data  with  peer  nodes  to  mitigate 
the  communication  bandwidth  requirements  placed  on  the  CEC  network.  This  manager  evaluates  the 
information  needs  and  capabilities  of  the  peer  nodes  and,  as  data  is  received,  it  is  propagated  to  the 
appropriate  node.  This  is  referred  to  as  an  intelligent  push  of  data.  The  Grapevine  manager  on  each  node 
is  also  responsible  for  communicating  the  local  information  data  needs  to  peer  nodes.  Since  the  peer 
nodes  will  identify  information  to  satisfy  those  needs,  this  is  referred  to  as  an  intelligent  pull  of  data. 
Finally,  Jameson  describes  the  Distributed  Hierarchical  Information  Fusion  architecture.  The  nodes  on 
this network correspond to military units in a command and control hierarchy. Each node is responsible
for  propagating  the  collected  information  to  its  parent  and  children  nodes.  Since  data  is  propagated  only 
to  adjacent  fusion  nodes  (Figure  3),  flow  of  information  is  faster  than  in  certain  centralized  architectures 
(e.g.,  Figure  1). 

Most  planning  and  control  algorithms  assume  a  centralized  framework.  When  the  challenges  of 
decentralized  frameworks  are  addressed,  architectures  such  as  the  one  implemented  by  CEC  are  usually 
assumed:  each  UxV  collects  information  from  on-board  sensors  and,  over  a  low  latency  network, 
information  is  exchanged  with  neighbor  UxVs  (i.e.,  peer  nodes).  Information  is  processed  and  trajectories 
are  updated,  as  appropriate. 


2.1.  Algorithms  for  Planning  and  Control  Systems 

Research  in  the  area  of  planning  and  control  systems  has  resulted  in  the  development  of  algorithms 
assuming,  mostly,  a  centralized  framework  [1,  2]:  information  is  collected  in  a  single,  central  node  and 
optimal  or  near-optimal  plans  are  defined  and  communicated  back  to  the  agents  (e.g.,  UxVs)  in  the 
system.  A  smaller  fraction  of  the  research  in  this  area  has  been  concerned  with  the  coordination  of 
resources  in  a  decentralized  environment. 

Jin,  Minai  and  Polycarpou,  in  [5],  considered  two  classes  of  UAVs,  target  recognition  UAVs  and  attack 
UAVs,  for  the  search-and-destroy  problem  over  an  area.  All  UAVs  were  assumed  to  have  sensors  needed 
for search. A distributed assignment, mediated through centralized mission status information, was
developed. At each potential target location (the environment was discretized as a set of cells), UAVs could
Search,  Confirm,  Attack,  perform  Battle  Damage  Assessment  (BDA),  or  Ignore.  A  centralized 
information  base  kept  essential  information  updated  for  the  coordination  of  the  UAVs.  Information 
included,  for  each  target  location,  the  target  occupancy  probability,  certainty,  task  status,  and  assignment 
status.  In  addition,  the  information  base  included  state  information  for  each  UAV.  A  set  of  rules,  based  on 
the  information  contained  in  the  information  base,  determined  the  assignment  of  tasks  to  UAVs.  Each 
UAV  accessed  and  updated  the  information  base  at  each  step.  Two  measures  of  performance  were 
considered  to  evaluate  the  proposed  algorithm:  the  time  needed  to  neutralize  all  a  priori  known 
(stationary)  targets,  and  the  total  number  of  steps  needed  to  bring  all  cells  to  the  Ignore  status. 

Shetty,  Sudit,  and  Nagi,  in  [6],  considered  the  routing  of  multiple  unmanned  (combat)  vehicles  to  service 
multiple  potential  targets  in  space.  They  formulated  this  problem  as  a  Mixed-Integer  Linear  Program 
(MILP),  and  decomposed  the  problem  into:  (1)  the  vehicle  to  target  assignment  problem,  and  (2) 
determining  the  tour  for  each  vehicle  to  service  their  assigned  targets.  Each  problem  was  solved  using  a 
tabu-search  heuristic. 

A  modeling  framework  for  the  dynamic  rerouting  of  a  set  of  heterogeneous  vehicles  was  presented  by 
Murray and Karwan in [7]. Vehicles were constrained by fuel and payload capacity. The defined MILP
maximizes  overall  mission  effectiveness  and  minimizes  changes  to  the  original  vehicle  task  assignments 
(i.e.,  the  previous  solution).  Tasks  were  characterized  by  priority  values,  service  duration,  limits  on  the 
number  of  resources  that  may  perform  them,  precedence  relationships  among  tasks,  and  multiple  time 
windows  in  which  resources  could  be  assigned  to  the  tasks.  In  addition,  tasks  were  classified  as  required 
or  optional.  Vehicles  were  characterized  by  a  value  indicating  the  resource's  relative  capability  of 
performing  a  task. 

Several authors (e.g., [8] - [10]) have taken an information-theoretic approach to the resource allocation
problem.  From  this  perspective,  the  purpose  of  the  resource  management  algorithm  is  to  reduce  the 
uncertainty  about  the  environment.  In  [8],  McIntyre  and  Hintz  demonstrated  the  effectiveness  of  this 
approach  for  sensor  management  on  the  problem  of  searching  and  tracking  targets.  For  this  problem,  the 
area of operation was represented as a grid divided into m × n cells. Information about targets was
represented as a discrete probability density function (pdf) on the m × n area. The pdf represented the
sensors' estimate of the location of the targets. Two types of uncertainty were considered in this
problem: (i) location of undetected targets, and (ii) estimation of target state vectors. The manager
decided  which  sensor  to  use  and  whether  to  continue  tracking  a  target  (already  represented  as  a  track)  or 
to  search  for  new  ones.  When  a  cell  was  observed  by  a  sensor,  the  amount  of  information  gained  was 
defined as the change in entropy between the state prior to and the state following a sensor measurement. The information gained
by observing a cell on the grid depended on the probability of obtaining a detection or not. The information
gained  by  updating  a  (detected)  target  state  vector  considered  a  norm  of  the  respective  track's  error 
covariance  matrix.  The  sensor  management  control  algorithm  consisted  of  comparing  the  potential 
information  gain  from  each  sensor  and  target  combination.  Once  a  target  was  detected,  the  amount  of 
information  gain  was  computed  and  a  decision  on  whether  to  update  the  track  or  to  search  was  made.  If 
search  was  decided,  the  cell  with  the  highest  probability  of  detecting  a  target  was  determined  and  that  is 
where  a  sensor  was  aimed.  Also  using  an  information-theoretic  approach,  Kreucher,  et  al.  in  [10], 
presented  their  results  on  a  decentralized  sensor  management  algorithm. 
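
The entropy-based notion of information gain described above can be illustrated with a small sketch. It is a generic illustration and not the formulation used in [8]: the single-cell Bernoulli model, the Bayes update after a look that reports no detection, and the names `binary_entropy`, `posterior_after_miss`, `p_detect` and `p_false_alarm` are all assumptions made here for the example.

```python
import math


def binary_entropy(p):
    """Entropy (in bits) of a cell whose target-occupancy probability is p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def posterior_after_miss(p, p_detect, p_false_alarm):
    """Bayes update of the occupancy probability after a look with no detection.

    p_detect and p_false_alarm are hypothetical sensor parameters.
    """
    miss_given_target = 1.0 - p_detect
    miss_given_empty = 1.0 - p_false_alarm
    evidence = p * miss_given_target + (1.0 - p) * miss_given_empty
    return p * miss_given_target / evidence


# A cell with a 50% prior, observed by a sensor that detects 90% of targets
# and (by assumption) never raises a false alarm.
prior = 0.5
post = posterior_after_miss(prior, p_detect=0.9, p_false_alarm=0.0)

# Information gained by this look: entropy before minus entropy after.
gain = binary_entropy(prior) - binary_entropy(post)
```

Under these assumptions the occupancy probability drops from 1/2 to 1/11 and the look removes a little over half a bit of uncertainty about the cell.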

Hirsch,  et  al.,  in  [11],  mathematically  formulated  the  problem  of  dynamically  tracking  targets  of  interest 
by  a  set  of  autonomous  UAVs  in  a  centralized  cooperative  control  framework.  A  decentralized  control 
approach  for  UAVs  with  the  goal  of  tracking  moving  ground  targets  was  developed  by  Hirsch,  Ortiz-Pena 
and  Eck  in  [12].  Targets  and  UAVs  were  moving  through  an  urban  domain,  simulated  as  a  set  of 
buildings.  The  shape  and  location  of  each  building  was  assumed  to  be  known  by  each  UAV.  Areas  in  the 
urban  domain  in  which  an  accurate  representation  of  the  ground  targets  was  more  important  were 
represented  by  an  importance  function.  This  function  was  modeled  as  the  sum  of  Gaussian  probability 
density functions (each density function represented an individual area of importance). The vehicles
operated in a decentralized manner, in which each UAV was responsible for planning its route to maintain an
accurate  representation  of  detected  targets.  A  non-linear  optimization  problem  was  defined  and  solved  at 
each  time  step  of  the  duration  of  the  simulation.  A  Continuous  Greedy  Randomized  Adaptive  Search 
Procedure  (C-GRASP)  [13,  14]  was  utilized  to  approximately  solve  this  optimization  problem. 

The  vehicles  were  modeled  as  non-holonomic  point  masses  on  a  two-dimensional  plane  with  a  minimum 
turning  radius  (i.e.,  a  Dubins  vehicle  [15]),  and  a  minimum  and  maximum  speed.  Communication  among 
UAVs  was  restricted  to  a  maximum  communication  range,  beyond  which  UAVs  could  not  share 
information.  A  minimum  distance  among  UAVs  and  between  buildings  was  also  considered  as  a  collision 
avoidance  mechanism.  UAVs  were  assumed  to  be  flying  at  a  constant  altitude,  below  the  height  of  the 
buildings. Additional constraints in the formulation included line of sight to targets due to the presence of
buildings, formulated through the use of Plücker coordinates [16, 17], and detection range limitations. Each
UAV  operated  its  own  dynamic  feedback  loop  in  which,  at  each  time  step,  it  moved  according  to  its 
current  flight  plan,  sensed  the  environment  with  on-board  sensors,  received  and  shared  collected 
information  of  the  environment  with  neighbor  UAVs  (if  any)  and,  when  required,  planned  its  next  flight 
path  for  a  fixed  number  of  time  steps. 

In  [18],  Hirsch,  Ortiz-Pena  and  Sudit  studied  the  effects  of  this  decentralized  control  approach  for  the 
cooperative  tracking  of  ground  targets  in  an  urban  environment,  as  a  function  of  the  number  of  UAVs.  It 
was  shown,  experimentally,  that  the  decentralized  approach  exploits  the  availability  of  multiple  UAVs  by 
defining  routes  that  resulted  in  an  accurate  representation  of  the  targets. 

In [19], Hirsch and Ortiz-Pena extended their work in [12] by considering the decentralized control of
autonomous UAVs divided into two sets (determined a priori):

1.  a  set  of  low-level  UAVs  responsible  for  sensing  the  environment  with  the  goal  of  accurately 
tracking  targets  of  interest  in  the  urban  domain,  and 

2.  a  set  of  high-level  UAVs  responsible  for  providing  the  communication  back-bone  for  the 
autonomous  low-level  UAVs  tracking  the  targets  of  interest. 

Low-level  UAVs  were  modeled  as  in  [12],  with  the  additional  constraint  that  they  could  only 
communicate  with  high-level  UAVs.  High-level  UAVs,  also  in  a  decentralized  cooperative  control 
framework, maximized the potential communication of the low-level UAVs by planning routes that would
keep them within communication range of both low-level UAVs and other high-level UAVs. A
non-linear optimization problem was defined and solved (using the C-GRASP heuristic) at each time step of
the simulation. This optimization problem maximized the number of potential direct connections of the
UAVs over the planning horizon. High-level UAVs were assumed to be flying at a higher altitude than the
low-level UAVs and the buildings. The communications back-bone created by the high-level UAVs was
the mechanism by which low-level UAVs shared, with other low-level UAVs, the information on the
targets they were tracking.


3.0  Model  Description  and  Assumptions 

Effective  utilization  of  a  set  of  unmanned  vehicle  systems,  with  limited  capabilities,  is  required  in  order  to 
fulfill  a  set  of  mission  objectives  efficiently.  Following  the  approach  studied  in  [19],  we  considered  a  set 
of  unmanned  vehicle  systems  consisting  of  an  unmanned  vehicle  and  a  control  station.  The  UxV  is 
assumed  to  follow,  autonomously,  the  trajectories  defined  at  its  CS.  It  is  further  assumed  that  the  UxV  has 
to  remain  within  communication  range  of  its  CS  and  cannot  communicate  with  other  CSs  in  the  area  of 
operation.  The  control  station  is  responsible  for  receiving  information  from  the  sensors  on-board  its 
controlled UxV and from neighbor CSs. The communication range to share information with other CSs
might be different from the communication range required to control its UxV and receive the information
it collects.

In  our  algorithmic  development,  we  assumed  the  effectiveness  of  each  sensor,  for  each  mission,  is  known 
a  priori.  This  effectiveness  may  vary  over  time.  The  area  of  interest  (i.e.,  area  of  operation)  is  represented 
by  a  grid  in  which  each  cell  represents  a  specified  region.  Time  is  discretized  and,  at  each  time  slice,  the 
UxV  may  be  assigned  one  (and  only  one)  cell  of  a  subset  of  all  cells  in  the  grid,  depending  on  its  previous 
assignments. 
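
The dependence of the admissible cells on previous assignments can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the paper: cells are numbered row by row, a vehicle may stay in place or move to one of the eight neighbouring cells, and it may not return directly to the cell it occupied two slices earlier (a crude stand-in for a turn constraint). The function name `reachable_cells` is hypothetical; a rule of this kind would play the role of the reachability parameters of Section 3.1.1.

```python
from itertools import product


def reachable_cells(k_prev2, k_prev1, n_rows, n_cols):
    """Cells a vehicle may occupy at slice t, given its cells at t-2 and t-1.

    Assumed movement rule (illustrative only): stay, or move to one of the
    eight neighbours of the cell occupied at t-1, without reversing straight
    back to the cell occupied at t-2.
    """
    row1, col1 = divmod(k_prev1, n_cols)
    cells = set()
    for d_row, d_col in product((-1, 0, 1), repeat=2):
        row, col = row1 + d_row, col1 + d_col
        if 0 <= row < n_rows and 0 <= col < n_cols:
            cells.add(row * n_cols + col)
    # Forbid an immediate reversal, unless the vehicle was already loitering.
    if k_prev2 != k_prev1:
        cells.discard(k_prev2)
    return cells


# 3 x 3 grid, cells numbered 0..8 row by row; the vehicle moved from 3 to 4.
from_centre = reachable_cells(3, 4, n_rows=3, n_cols=3)
# A vehicle that loitered in the corner cell 0 for two slices.
from_corner = reachable_cells(0, 0, n_rows=3, n_cols=3)
```

In a model of this kind the set returned for each pair of previous cells would be precomputed once and stored as a binary parameter, so that reachability enters the program as data rather than as additional decision logic.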


3.1.1  Model  parameters 

The following parameters are defined:

$I$ = set of UxVs, indexes $i, j = \{1, 2, \ldots, |I|\}$

$R$ = set of collection requirements (i.e., mission), index $r = \{1, 2, \ldots, |R|\}$

$K$ = set of grid cells, indexes $k, k', k'' = \{1, 2, \ldots, |K|\}$

$T$ = set of time slices, index $t = \{1, 2, \ldots, |T|\}$

$x_{ik0}$ = 1 if UxV $i$ is at cell $k$ at the time of planning; 0 otherwise

$x_{ik,-1}$ = 1 if UxV $i$ is at cell $k$ at the time slice prior to the time of planning; 0 otherwise

$w_{rt}$ = weighting parameter of collection requirement $r$ at time slice $t$

$\eta_{kk'}$ = Euclidean distance between cells $k$ and $k'$, $\eta_{kk'} > 0$ $\forall k, k'$

$CR_i$ = communication range of UxV $i$'s Control Station (CS) to UxV, $CR_i > 0$ $\forall i$

$CR_i$ = communication range of UxV $i$'s CS to other CSs, $CR_i > 0$ $\forall i$

$\psi_{ik''k'k}$ = 1 if UxV $i$, at cell $k''$ at time slice $t-2$ and at cell $k'$ at time slice $t-1$, can be assigned to cell $k$ at time slice $t$; 0 otherwise

$\varphi_{ik''k'k}$ = 1 if CS $i$, at cell $k''$ at time slice $t-2$ and at cell $k'$ at time slice $t-1$, can be assigned to cell $k$ at time slice $t$; 0 otherwise

$e_{irkk't}$ = effectiveness of the team of UxVs on cell $k$ for collection requirement $r$ at time slice $t$,
when UxV $i$ is assigned to cell $k'$

3.1.2  Main  Decision  Variables 


$x_{ikt}$ = 1 if UxV $i$ is assigned to cell $k$ at time slice $t$; 0 otherwise

$y_{ikt}$ = 1 if UxV $i$'s Control Station (CS) is assigned to cell $k$ at time slice $t$; 0 otherwise

$f_{rkt}$ = potential information gain from cell $k$ for collection requirement $r$ at time slice $t$

$d_{rkt}$ = increase in information value from cell $k$ for collection requirement $r$ at time slice $t$

$g_{rkt}$ = information gain from cell $k$ for collection requirement $r$ at time slice $t$

$\Delta_{ijt}$ = distance from UxV $i$'s CS to UxV $j$'s CS at time slice $t$

$\Omega_{ijrt}$ = value of information gain for collection requirement $r$ by UxV $j$,
when CS $i$ is connected to CS $j$ at time slice $t$

$c_{ijt}$ = 1 if UxV $j$'s CS and UxV $i$'s CS are within communication range at time slice $t$; 0 otherwise


3.1.3  Objective  Function  and  Constraints 

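$$\alpha \max \sum_{r} \sum_{t} \sum_{k} w_{rt}\, g_{rkt} \;+\; (1 - \alpha) \min \max_{r,k,t} \{ f_{rkt} \} \;+\; \beta \max \sum_{i} \sum_{j \ne i} \sum_{r} \sum_{t} w_{rt}\, \Omega_{ijrt} \qquad (1)$$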

The objective function in equation (1) consists of three terms:

(1) $\max \sum_{r} \sum_{t} \sum_{k} w_{rt}\, g_{rkt}$, which maximizes the potential information to be gained by the set of UxVs;

(2) $\min \max_{r,k,t} \{ f_{rkt} \}$, which minimizes the maximum value of potential information in the area of
interest; this term is required to prevent the UxVs from loitering on cells with low potential information
gain until other cells increase their value; and

(3) $\max \sum_{i} \sum_{j \ne i} \sum_{r} \sum_{t} w_{rt}\, \Omega_{ijrt}$, which maximizes the connectivity of CSs by considering the collected
information of their UxVs.


$$\sum_{k} x_{ikt} = 1, \quad \forall i, \forall t \qquad (2)$$


Constraint  (2)  ensures  that  each  UxV  is  assigned  a  (single)  cell  at  each  time  slice  t. 

ii fc'Vfur  =  1,  Vi,  Vt;  k,  k",k' |i£vv*  =  1 

(3a) 


k  kr  k 


+  2.  Vi,vt,  Vfc,  k',k" wik"k'k  =  i 

(3b) 


Constraints (3) ensure that each UxV $i$ is assigned to a cell that can be reached at time slice $t$, given its
assignments $x_{ik'',t-2}$ and $x_{ik',t-1}$ at times $t-2$ and $t-1$, respectively (see Section 3.1.1 for the definition
of $\psi_{ik''k'k}$).


Vi,  Vt 

(4) 


Constraint  (4)  ensures  that  UxV  i  remains  within  communication  distance  of  its  CS. 


Similar  to  constraints  (2)  -  (4),  the  assignment  of  CSs  to  a  cell  k  is  constrained  by 


$$\sum_{k} y_{ikt} = 1, \quad \forall i, \forall t \qquad (5)$$


where  constraint  (5)  ensures  that  each  CS  is  assigned  a  (single)  cell  each  time  slice  t. 


$$\sum_{k} \sum_{k'} \sum_{k''} o_{ik''k'kt} = 1, \quad \forall i, \forall t;\ k, k', k'' \mid \varphi_{ik''k'k} = 1 \qquad (6a)$$

$$o_{ik''k'kt} \le \frac{y_{ikt} + y_{ik',t-1} + y_{ik'',t-2}}{3}, \quad \forall i, \forall t, \forall k, k', k'' \mid \varphi_{ik''k'k} = 1 \qquad (6b)$$


Constraints (6) ensure that each CS $i$ is assigned to a cell that can be reached at time slice $t$, given its
assignments $y_{ik'',t-2}$ and $y_{ik',t-1}$ at times $t-2$ and $t-1$, respectively (see Section 3.1.1 for the definition
of $\varphi_{ik''k'k}$).


$$f_{rkt} = \underbrace{f_{rk,t-1}}_{\text{potential information gain}} + \underbrace{d_{rkt}}_{\text{temporal}} - \underbrace{g_{rkt}}_{\text{geospatial}} \qquad (7)$$


$f_{rkt}$ in constraint (7) is the potential information gain for collection requirement $r$ from cell $k$ at time slice
$t$. It consists of three components: for each collection requirement $r$, $f_{rk,t-1}$ represents the potential
information remaining on cell $k$ from the previous time slice (i.e., potential information on cell $k$ once the
UxVs were assigned to a cell and collected information at time slice $t-1$); the temporal component $d_{rkt}$
represents the increase in potential information from time slice $t-1$ to time slice $t$; the geospatial
component $g_{rkt}$ represents the information gained from cell $k$, given the assignment of UxVs at time slice
$t$. This is represented in Figure 4. In this figure, the paths of 3 UxVs need to be defined so that the
information gain is maximized. Potential information gain is represented by the color of each cell (color
follows the spectrum shown in the figure, in which blue indicates low information gain and red
indicates high information gain). At each time slice, the current information gain is increased by $d_{rkt}$.
Each configuration of UxVs (i.e., assignment of UxVs to cells), at each time slice, results in an
information gain $g_{rkt}$, decreasing the potentially available information for the next time slice. The
solution to our mathematical programming model represents the optimal path each UxV should follow so
that the overall information gain, over the planning horizon, is maximized.


(Panels: t = 1, t = 2, t = 3, linked by + $d_{rk1}$ and + $d_{rk2}$; color scale from Low IG to High IG.)


Figure 4. Relationship of Temporal and Geospatial Components
on Potential Information Gain Map

$g_{rkt}$ can be represented as

$$g_{rkt} = e_{rk}(x_t)\, (f_{rk,t-1} + d_{rkt}) \qquad (8)$$

where $e_{rk}(x_t) \in [0,1]$ represents the effectiveness of the team of UxVs gaining information on collection
requirement $r$ from cell $k$, as a function of the assignment $x_t$.

$x_t$ = vector of cell assignments at time slice $t$, for all UxVs. For consistency, $e_{rk}(x_t)$ will
be represented by $e_{rkt}$.
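
The interplay of equations (7) and (8) can be traced numerically for a single cell and a single collection requirement. The sketch below is only an illustration: the function name `update_potential` and the numbers are made up, and the effectiveness values are supplied directly rather than derived from any assignment of UxVs.

```python
def update_potential(f_prev, d, e):
    """One time-slice update of the potential information on a single cell.

    f_prev: potential information left on the cell after slice t-1
    d:      temporal increase from slice t-1 to slice t
    e:      team effectiveness on the cell, a value in [0, 1]
    Returns (g, f): the information gained, as in equation (8), and the
    potential information remaining on the cell, as in equation (7).
    """
    if not 0.0 <= e <= 1.0:
        raise ValueError("effectiveness must lie in [0, 1]")
    available = f_prev + d   # what could be collected at this slice
    g = e * available        # equation (8)
    f = available - g        # equation (7)
    return g, f


# Illustrative run over three time slices with a constant temporal increase:
# the cell is not observed, then partially observed, then fully observed.
f = 10.0
history = []
for e in (0.0, 0.5, 1.0):
    g, f = update_potential(f, d=2.0, e=e)
    history.append((g, f))
```

In this run the unobserved cell accumulates potential information in the first slice, half of the available information is collected in the second, and a fully effective observation in the third leaves nothing behind, which is the depletion effect the geospatial component is meant to capture.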


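Using the Attenuated Disk Model ([20]) we can define $e_{rkt}$ by

$$e_{rkt} = \sum_{j} \sum_{k'} e_{jrkk't}\, x_{jk't}$$

$g_{rkt}$ can then be rewritten as

$$\begin{aligned}
g_{rkt} &= \sum_{j} \sum_{k'} e_{jrkk't}\, x_{jk't}\, (f_{rk,t-1} + d_{rkt}) \\
&= \sum_{j} \sum_{k'} \left( e_{jrkk't}\, x_{jk't}\, f_{rk,t-1} + e_{jrkk't}\, x_{jk't}\, d_{rkt} \right)
\end{aligned}$$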

^jrkk^t  '  ^irkkft  "F  '  Kirkkl’t.'  Vk,  Vt 

*  /  fer 


/  kr 


where  Firickrt  —  -^ikrt  ^rkt* 


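For communication objectives, consider

$$\Delta_{ijt} = \sum_{k} \sum_{k'} \eta_{kk'}\, y_{ikt}\, y_{jk't} \qquad (9)$$

Decision variable $c_{ijt}$ is defined by

$$c_{ijt} = \begin{cases} 1 & \text{if } \Delta_{ijt} < CR_i + CR_j \\ 0 & \text{otherwise} \end{cases} \qquad (10)$$

Let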


^jrt  —  y  '  y  ejrk  k f  t  ^jfk  k f  tf  Y/ 1  ^ 

k  kT 

(11) 


where $U_{jrt}$ represents the contribution of UxV $j$ to collection requirement $r$ at time slice $t$. Then

$$\Omega_{ijrt} = \begin{cases} U_{jrt} & \text{if } c_{ijt} = 1 \\ 0 & \text{otherwise} \end{cases} \quad \forall i, j,\ j \ne i,\ \forall r,\ \forall t \qquad (12)$$


Equations (1) - (11) were linearized. When a centralized framework is assumed, the set $I$ includes all
available UxVs to be assigned in the area of interest; when a decentralized framework is assumed, each
CS solves the programming model considering only the other CSs within communication range.
While analyzing the effects of decentralization on solution quality, $\beta = 0$ and constraints (9) to (11) are
not enforced: different configurations of UxVs are evaluated (i.e., $c_{ijt} = 1$ for some $i$, $j$ and $t$) without
considering the CS communication range limitations. Solutions for these configurations are then used to
compute the price of anarchy (see Section 4).
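
Evaluating a given configuration amounts to fixing the connectivity values for a time slice and reading off which CSs end up planning together. The sketch below shows one way to extract those groups as connected components of a symmetric 0/1 matrix. The function name `sub_networks` and the example matrices are illustrative assumptions, not part of the model.

```python
def sub_networks(c):
    """Group CSs into sub-networks (connected components).

    c is a symmetric 0/1 matrix for one time slice, where c[i][j] == 1 means
    that CS i and CS j are treated as connected in the configuration under
    study. Returns a list of sorted index lists, one per sub-network.
    """
    n = len(c)
    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search from an unvisited CS.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if c[i][j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        components.append(sorted(component))
    return components


# Four CSs: fully connected (centralized), two pairs, and fully isolated.
centralized = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
two_pairs = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
isolated = [[0] * 4 for _ in range(4)]

groups_centralized = sub_networks(centralized)
groups_two_pairs = sub_networks(two_pairs)
groups_isolated = sub_networks(isolated)
```

Each group returned for a configuration corresponds to one planning problem solved over only the UxVs in that group, so the three matrices above span the range from a single centralized problem to four independent ones.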


4.0  Measuring  the  Price  of  Anarchy 

As discussed in Section 2, most planning and control algorithms assume a centralized architecture. Given
that reality, how can a system designer quantify the potential loss of system efficiency when the
planning and control system is implemented in a decentralized framework (compared to a centralized
framework)? Having a mathematical model of the system, we are interested in measuring the effects of
the lack of an overall, “central” controller relative to the optimal value that would be obtained when such
centralized coordination is available. The “Price of Anarchy” (PoA) is defined as a measure of the
degradation of solution quality as a centralized system moves to a more decentralized framework. The
term “price of anarchy” was used in [21] to refer to the inefficiency of a system when individuals (i.e.,
agents) optimize their decisions without coordination. Researchers have continued using this term to refer to
the efficiency-loss ratio described above [22] - [27]. We will consider PoA to be defined as


\[
\mathrm{PoA} = 1 - \frac{\sum_{r} \sum_{t} \sum_{k} w_{rt} \, g_{rkt}}{\sum_{r} \sum_{t} \sum_{k} w_{rt} \, g^{*}_{rkt}}
\tag{12}
\]


where g*_rkt refers to the information obtained by the solution of the centralized framework for collection
requirement r from cell k at time slice t. As described in Section 3, w_rt represents a weighting parameter
for collection requirement r at time slice t.
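As a minimal sketch of how this ratio could be evaluated once two solutions are available, the snippet below computes PoA from nested lists indexed as `gain[r][k][t]` and `w[r][t]`. The function name, the data layout and the numbers are assumptions made for illustration; they are not results from the model.

```python
def price_of_anarchy(w, g, g_star):
    """PoA = 1 - sum(w[r][t] * g[r][k][t]) / sum(w[r][t] * g_star[r][k][t])."""
    def weighted_gain(gain):
        return sum(
            w[r][t] * gain[r][k][t]
            for r in range(len(gain))
            for k in range(len(gain[r]))
            for t in range(len(gain[r][k]))
        )

    centralized = weighted_gain(g_star)
    if centralized == 0:
        raise ValueError("centralized information gain is zero; PoA is undefined")
    return 1.0 - weighted_gain(g) / centralized

# One collection requirement, two cells, two time slices (made-up numbers).
w = [[1.0, 1.0]]
g_star = [[[0.4, 0.2], [0.3, 0.1]]]   # centralized solution
g = [[[0.4, 0.1], [0.2, 0.1]]]        # some decentralized configuration
poa = price_of_anarchy(w, g, g_star)  # roughly 0.2, i.e. a 20% loss
```

Passing the centralized solution for both arguments returns 0, which matches the 0% entry expected for the centralized framework.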

As indicated above in the proposed model, network connectivity is represented by a set of binary
variables, c_ijt. When communication links are removed (i.e., c_ijt = 0) from the original problem (i.e., the
centralized framework), in which all communication links are present (i.e., c_ijt = 1), the structure of the
connectivity matrix allows the identification of sub-networks (i.e., connected components) within the set
of UxVs (Figure 5). Effects on PoA will be studied as each unmanned vehicle system defines its UxV
flight plan based only on the information to be obtained from its sub-network. Degrees of decentralization
will be represented by redefining the structure of this connectivity matrix, and the “price of anarchy” will
be measured by comparing the resulting information gain against the best-case “centralized optimal
solution”. In (12), g_rkt is the information gain for collection requirement r from cell k at time
slice t obtained by solving the mathematical programming model consisting of equations (1) - (9) for the
particular network configuration.
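The identification of sub-networks from the connectivity matrix can be sketched with a standard depth-first search. The code below assumes a symmetric 0/1 matrix for a single time slice and uses 0-based vehicle indices; the function name `connected_components` is illustrative and not taken from the model.

```python
def connected_components(c):
    """Group vehicle indices into sub-networks from a symmetric 0/1 matrix c."""
    n = len(c)
    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search from an unvisited vehicle collects one sub-network.
        stack, component = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if c[i][j] == 1 and j not in seen:
                    seen.add(j)
                    stack.append(j)
        components.append(sorted(component))
    return components

# UAVs 1 and 2 communicate; UAV 3 is isolated (0-based indices 0, 1, 2).
c = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]
subnets = connected_components(c)
```

Each returned group corresponds to one bracketed set of UxVs that would plan together, so removing a link and rerunning the search shows how the team splits.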
Figure 5. Example of Resulting Sub-Networks when a Communication Link is Removed


5.0  Preliminary  Results 

Based on the concepts described in Sections 3 and 4, a simple simulation was implemented to show the
applicability and potential of our mathematical programming model for studying the effects of
decentralization on solution quality. We considered the assignment of 3 UxVs, specifically a set of
unmanned aerial vehicles (UAVs), in the area of operation represented by the grid of 5 × 5 cells shown in Figure 6.
All UAVs are assumed to be autonomous, each with a single on-board sensor. We considered a search mission
in which the potential information in each cell represents the likelihood of finding a high value target in
that cell. A region of low potential information gain is identified in the area of operation, representing, for
example, a lake when searching for a car. The planning horizon was assumed to be 5 time slices.


Figure  6.  Area  of  operation: 

(A)  represents  a  low  potential  information  gain  area; 

(B)  represents  a  high  potential  information  gain  area 
We modeled three identical UAVs:

1)  On-board  sensors  are  assumed  to  be  radars,  having  a  discretized  effectiveness  of  collecting 
information  as  shown  in  Figure  7. 

2)  UAVs  can  only  move  to  adjacent  cells,  not  diagonally. 

3)  All  unmanned  aerial  systems  will  have  the  same  initial  potential  information  gain  map. 
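The movement rule in item 2 can be illustrated with a small helper that lists the cells reachable in one time slice on the 5 × 5 grid. This is a sketch only: the function name is hypothetical, cells are addressed as 0-based (row, column) pairs, and whether a UAV may also remain in its current cell is not stated here, so staying in place is left out.

```python
def adjacent_cells(cell, rows=5, cols=5):
    """Cells reachable in one time slice: up, down, left, right (no diagonals)."""
    r, c = cell
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    # Keep only the candidates that fall inside the grid.
    return [(i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols]

corner_moves = adjacent_cells((0, 0))   # two moves available from a corner
center_moves = adjacent_cells((2, 2))   # four moves available from the center
```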


0.0    0.10   0.0
0.10   0.33   0.10
0.0    0.10   0.0

Figure 7. Sensor Model (values represent the sensor’s discretized
effectiveness in acquiring the potential information of a cell)
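To make the sensor model concrete, the sketch below applies the 3 × 3 effectiveness values of Figure 7 around a UAV's cell. It assumes that the information collected from a cell is the product of the sensor effectiveness and the cell's potential information, and that footprint cells outside the grid contribute nothing; both are illustrative assumptions, as are the function name and the uniform map.

```python
SENSOR = [[0.0, 0.10, 0.0],
          [0.10, 0.33, 0.10],
          [0.0, 0.10, 0.0]]

def collected_information(potential, uav_cell):
    """Sum effectiveness * potential information over the 3 x 3 footprint."""
    rows, cols = len(potential), len(potential[0])
    r0, c0 = uav_cell
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = r0 + dr, c0 + dc
            if 0 <= r < rows and 0 <= c < cols:   # ignore cells off the grid
                total += SENSOR[dr + 1][dc + 1] * potential[r][c]
    return total

# Uniform potential information of 1.0 in every cell of a 5 x 5 grid.
uniform = [[1.0] * 5 for _ in range(5)]
center_gain = collected_information(uniform, (2, 2))   # full footprint
corner_gain = collected_information(uniform, (0, 0))   # footprint clipped by the border
```

On a uniform map the corner position collects less than the center because part of the footprint falls outside the area of operation.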


First,  to  study  the  price  of  anarchy,  we  considered  a  centralized  framework  in  which  all  information 
collected  by  each  UAV  is  assumed  to  be  available  at  a  “central”  controller.  This  allows  optimal 
coordination  of  all  UxVs  in  the  area  of  operation  to  maximize  the  information  gain  for  the  mission.  For 
this  case,  in  the  mathematical  programming  model,  the  set  I  includes  all  UxVs.  The  routes  for  each  UxV 
for  this  centralized  framework  are  shown  in  Figures  8a  -  8e.  We  also  considered  the  decentralized 
framework  in  which  each  UxV  is  operating  independently,  with  no  coordination  or  communication 
among the team members. Each unmanned vehicle system solves the mathematical programming model
considering  only  its  own  UxV.  The  routes  for  each  UxV  for  this  decentralized  framework  are  shown  in 
Figures  9a  -  9e. 


Figures 8a - 8e. Centralized Solution (t = 1 through t = 5)

Figures 9a - 9e. Decentralized Solution (t = 1 through t = 5)


From Figures 8a - 8e, note how the centralized, coordinated solution generally distributes the
UAVs over the area of interest. From t = 1 to t = 4, UAVs are assigned to areas in which their
sensor coverage does not overlap. At t = 5, when the sensor coverage of UAVs 1 and 3
overlaps, the potential information gain in the area of operation is relatively constant and low
(compared to the solution for the decentralized framework at the same time slice, Figure 9e). For
the decentralized solution in Figures 9a - 9e, each UAV tries to maximize its own potential
information gain, with no consideration for the effectiveness of the other UAVs in the area of
operations. In this framework, UAVs tend to travel to the same area of high potential information
gain, including visiting the same cell simultaneously (see Figure 9c). Note that the solutions
shown are the optimal allocations of UAVs, obtained by solving the mathematical programming model
described in Section 3 with CPLEX Interactive Optimizer 12.2.
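The clustering behavior described above can be reproduced with a deliberately tiny toy problem. The sketch below is not the MILP of Section 3: it assumes a single time slice, three cells with made-up potential information values, a sensor that collects all of a cell's information, and that a cell visited by two UAVs is counted only once.

```python
from itertools import product

def team_gain(info, cells):
    """Information gained when UAVs occupy `cells`; a shared cell counts once."""
    return sum(info[k] for k in set(cells))

def decentralized_plan(info, n_uavs):
    # Each UAV ignores the others and picks the single best cell for itself.
    best = max(range(len(info)), key=lambda k: info[k])
    return [best] * n_uavs

def centralized_plan(info, n_uavs):
    # Exhaustive search over joint assignments maximizes the team's gain.
    return max(product(range(len(info)), repeat=n_uavs),
               key=lambda cells: team_gain(info, cells))

info = [0.5, 0.3, 0.2]                      # hypothetical potential information
g_dec = team_gain(info, decentralized_plan(info, 2))
g_cen = team_gain(info, centralized_plan(info, 2))
poa = 1.0 - g_dec / g_cen                   # loss due to lack of coordination
```

In this toy case both uncoordinated UAVs pick the same cell, while the coordinated plan spreads them over the two most valuable cells, which is the effect visible when comparing the two sets of figures.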

The price of anarchy for different network configurations, representing different degrees of
decentralization, is shown in Table 2. UAV indexes within brackets represent connected
components (e.g., [1, 2] indicates that UAVs 1 and 2 communicate). For each decentralized
configuration, the solution quality (i.e., information gained over the planning horizon) decreased.
Among the partially connected configurations in Table 2, the inability to coordinate the flight path
of UAV 3 with either UAV 1 (as in [1, 3] [2]) or UAV 2 (as in [1] [2, 3]), that is, configuration
[1, 2] [3], resulted in the greatest loss of solution quality (6.43%), indicating how PoA can
be used to identify key communication links. Future research will include an in-depth analysis
and characterization of how differences in PoA across configurations relate to other
measures of performance (e.g., number of detected targets, track accuracy).
Table 2. Price of Anarchy

UAV Network Configuration                           PoA
[1, 2, 3] (Centralized Framework)                   0%
[1, 2] [3]                                          6.43%
[1, 3] [2]                                          2.80%
[1] [2, 3]                                          3.80%
[1] [2] [3] (Completely Decentralized Framework)    8.31%

6.0  Conclusions  and  Future  Needs 

A mathematical model for the cooperative control of multiple autonomous UxVs involved in ISR
operations was described. Using a small, simulated scenario, we applied the model to present an initial
evaluation of the price of anarchy as a measure of the degradation of solution quality as a centralized
system moves to a more decentralized framework. Future work will provide a more in-depth analysis of
the price of anarchy and develop heuristics that will allow us to solve the proposed
mathematical model within feasible operational timelines.

The price of anarchy will be extended to consider available bandwidth, expected latencies and the value
of the information flow present in the networks. A categorization of UxV missions should also be
studied (e.g., is the price of anarchy higher on “search” missions than on “surveillance” missions?). The
dynamics of targets and the rate and requirements of new tasks on different missions might require
contrasting levels of resource coordination. This type of analysis will be extremely beneficial to
researchers developing new algorithms for autonomous unmanned vehicle systems and network
infrastructures.